food type
What's on Your Plate? Inferring Chinese Cuisine Intake from Wearable IMUs
Yin, Jiaxi, Wang, Pengcheng, Ding, Han, Wang, Fei
Accurate food intake detection is vital for dietary monitoring and chronic disease prevention. Traditional self-report methods are prone to recall bias, while camera-based approaches raise concerns about privacy. Furthermore, existing wearable-based methods primarily focus on a limited number of food types, such as hamburgers and pizza, failing to address the vast diversity of Chinese cuisine. To bridge this gap, we propose CuisineSense, a system that classifies Chinese food types by integrating hand motion cues from a smartwatch with head dynamics from smart glasses. To filter out irrelevant daily activities, we design a two-stage detection pipeline. The first stage identifies eating states by distinguishing characteristic temporal patterns from non-eating behaviors. The second stage then conducts fine-grained food type recognition based on the motions captured during food intake. To evaluate CuisineSense, we construct a dataset comprising 27.5 hours of IMU recordings across 11 food categories and 10 participants. Experiments demonstrate that CuisineSense achieves high accuracy in both eating state detection and food classification, offering a practical solution for unobtrusive, wearable-based dietary monitoring. The system code is publicly available at https://github.com/joeeeeyin/CuisineSense.git.
- Health & Medicine > Consumer Health (1.00)
- Education > Health & Safety > School Nutrition (1.00)
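The two-stage pipeline described in the CuisineSense abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Window` type, the function `two_stage_predict`, the toy detector and classifier, and the food labels are all hypothetical stand-ins for trained models.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical: one window of IMU features, flattened to a list of floats.
Window = Sequence[float]


def two_stage_predict(
    windows: List[Window],
    is_eating: Callable[[Window], bool],
    classify_food: Callable[[Window], str],
) -> List[Optional[str]]:
    """Run stage 2 (food type) only on windows that stage 1 flags as eating."""
    labels: List[Optional[str]] = []
    for window in windows:
        if is_eating(window):
            labels.append(classify_food(window))
        else:
            labels.append(None)  # non-eating activity is filtered out
    return labels


# Toy stand-ins for trained models (assumptions for illustration only).
def toy_is_eating(window: Window) -> bool:
    return sum(abs(x) for x in window) / len(window) > 0.5


def toy_classify(window: Window) -> str:
    return "noodles" if max(window) > 1.0 else "rice"


demo_windows = [[0.1, 0.0, 0.2], [0.9, 1.4, 0.6], [0.7, 0.8, 0.6]]
demo_labels = two_stage_predict(demo_windows, toy_is_eating, toy_classify)
print(demo_labels)  # [None, 'noodles', 'rice']
```

Gating the classifier behind the eating detector means the second stage never has to assign a food label to unrelated daily activity.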
IMRL: Integrating Visual, Physical, Temporal, and Geometric Representations for Enhanced Food Acquisition
Liu, Rui, Mahammad, Zahiruddin, Bhaskar, Amisha, Tokekar, Pratap
Robotic assistive feeding holds significant promise for improving the quality of life for individuals with eating disabilities. However, acquiring diverse food items under varying conditions and generalizing to unseen food presents unique challenges. Existing methods that rely on surface-level geometric information (e.g., bounding box and pose) derived from visual cues (e.g., color, shape, and texture) often lack adaptability and robustness, especially when foods share similar physical properties but differ in visual appearance. We employ imitation learning (IL) to learn a policy for food acquisition. Existing methods employ IL or Reinforcement Learning (RL) to learn a policy based on off-the-shelf image encoders such as ResNet-50. However, such representations are not robust and struggle to generalize across diverse acquisition scenarios. To address these limitations, we propose a novel approach, IMRL (Integrated Multi-Dimensional Representation Learning), which integrates visual, physical, temporal, and geometric representations to enhance the robustness and generalizability of IL for food acquisition. Our approach captures food types and physical properties (e.g., solid, semi-solid, granular, liquid, and mixture), models temporal dynamics of acquisition actions, and introduces geometric information to determine optimal scooping points and assess bowl fullness. IMRL enables IL to adaptively adjust scooping strategies based on context, improving the robot's capability to handle diverse food acquisition scenarios. Experiments on a real robot demonstrate our approach's robustness and adaptability across various foods and bowl configurations, including zero-shot generalization to unseen settings. Our approach achieves an improvement of up to 35% in success rate compared with the best-performing baseline.
The Restaurant Meal Delivery Problem with Ghost Kitchens
Neria, Gal, Hildebrandt, Florentin D, Tzur, Michal, Ulmer, Marlin W
Restaurant meal delivery has been growing rapidly in the last few years. The main challenges in operating it are the temporally and spatially dispersed stochastic demand that arrives from customers all over town, as well as the customers' expectation of timely and fresh delivery. To overcome these challenges, a new business concept has emerged: "ghost kitchens". This concept proposes synchronized food preparation for several restaurants in a central complex, exploiting consolidation benefits. However, dynamically scheduling food preparation and delivery is challenging, and we propose operational strategies for operating ghost kitchens effectively. We model the problem as a sequential decision process. For the complex, combinatorial decision space of scheduling order preparations, consolidating orders to trips, and scheduling trip departures, we propose a large neighborhood search procedure based on partial decisions and driven by analytical properties. Within the large neighborhood search, decisions are evaluated via a value function approximation, enabling anticipatory and real-time decision making. We show the effectiveness of our method and demonstrate the value of ghost kitchens compared to conventional meal delivery systems. We show that both integrated optimization of cook scheduling and vehicle dispatching, as well as anticipation of future demand and decisions, are essential for successful operations. We further derive several managerial insights, among them that companies should carefully consider the trade-off between fast delivery and fresh food.
- North America > United States > Iowa (0.04)
- Europe > Germany > Saxony-Anhalt > Magdeburg (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
- Information Technology > Architecture > Real Time Systems (0.88)
Vision-Based Approach for Food Weight Estimation from 2D Images
Wimalasiri, Chathura, Sahoo, Prasan Kumar
In response to the increasing demand for efficient and non-invasive methods to estimate food weight, this paper presents a vision-based approach utilizing 2D images. The study employs a dataset of 2380 images comprising fourteen different food types in various portions, orientations, and containers. The proposed methodology integrates deep learning and computer vision techniques, specifically employing Faster R-CNN for food detection and MobileNetV3 for weight estimation. The detection model achieved a mean average precision (mAP) of 83.41%, an average Intersection over Union (IoU) of 91.82%, and a classification accuracy of 100%. For weight estimation, the model demonstrated a root mean squared error (RMSE) of 6.3204, a mean absolute percentage error (MAPE) of 0.0640%, and an R-squared value of 98.65%. The study underscores the potential applications of this technology in healthcare for nutrition counseling, fitness and wellness for dietary intake assessment, and smart food storage solutions to reduce waste. The results indicate that the combination of Faster R-CNN and MobileNetV3 provides a robust framework for accurate food weight estimation from 2D images, showcasing the synergy of computer vision and deep learning in practical applications.
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Health & Medicine > Consumer Health (0.94)
- Education > Health & Safety > School Nutrition (0.47)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
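For readers unfamiliar with the regression metrics quoted in the weight-estimation abstract (RMSE, MAPE, R-squared), the sketch below shows their standard definitions on made-up numbers. The weights are invented for illustration and are not taken from the paper. Note that MAPE is reported in the literature either as a fraction or as a percentage; this sketch returns a percentage.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error, in the same units as the target."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error (in percent); undefined if any true value is 0."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented example weights in grams (not data from the paper).
true_weights = [100.0, 200.0, 300.0, 400.0]
pred_weights = [110.0, 190.0, 310.0, 390.0]

print(rmse(true_weights, pred_weights))       # 10.0
print(mape(true_weights, pred_weights))       # about 5.21
print(r_squared(true_weights, pred_weights))  # about 0.992
```

RMSE is scale-dependent (grams here), while MAPE and R-squared are unitless, which is why abstracts often report all three together.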
To Ask or Not To Ask: Human-in-the-loop Contextual Bandits with Applications in Robot-Assisted Feeding
Banerjee, Rohan, Jenamani, Rajat Kumar, Vasudev, Sidharth, Nanavati, Amal, Dean, Sarah, Bhattacharjee, Tapomayukh
Robot-assisted bite acquisition involves picking up food items that vary in their shape, compliance, size, and texture. A fully autonomous strategy for bite acquisition is unlikely to efficiently generalize to this wide variety of food items. We propose to leverage the presence of the care recipient to provide feedback when the system encounters novel food items. However, repeatedly asking for help imposes cognitive workload on the user. In this work, we formulate human-in-the-loop bite acquisition within a contextual bandit framework and propose a novel method, LinUCB-QG, that selectively asks for help. This method leverages a predictive model of cognitive workload in response to different types and timings of queries, learned using data from 89 participants collected in an online user study. We demonstrate that this method enhances the balance between task performance and cognitive workload compared to autonomous and querying baselines, through experiments in a food dataset-based simulator and a user study with 18 participants without mobility limitations.
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.88)
- Research Report > Experimental Study (0.67)
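To make the contextual-bandit framing in the abstract above more concrete, here is a minimal sketch of standard disjoint LinUCB with a simple ask-for-help gate. The gate shown (query when the chosen arm's confidence width exceeds a threshold) is a hypothetical placeholder, not the LinUCB-QG rule from the paper, which additionally relies on a learned model of cognitive workload. The names `LinUCBArm`, `choose`, and `query_threshold` are assumptions for illustration.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def mat_vec(m: Matrix, v: Vector) -> Vector:
    return [dot(row, v) for row in m]


class LinUCBArm:
    """One arm of disjoint LinUCB, storing A^{-1} and b directly."""

    def __init__(self, dim: int, alpha: float = 1.0) -> None:
        self.alpha = alpha
        # A starts as the identity, so its inverse is the identity too.
        self.a_inv: Matrix = [
            [1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)
        ]
        self.b: Vector = [0.0] * dim

    def estimate(self, x: Vector) -> Tuple[float, float]:
        """Return (predicted reward, confidence width) for context x."""
        theta = mat_vec(self.a_inv, self.b)
        width = self.alpha * math.sqrt(dot(x, mat_vec(self.a_inv, x)))
        return dot(theta, x), width

    def update(self, x: Vector, reward: float) -> None:
        """Rank-one update A += x x^T via the Sherman-Morrison formula."""
        ax = mat_vec(self.a_inv, x)  # A^{-1} x (A^{-1} is symmetric)
        denom = 1.0 + dot(x, ax)
        for i in range(len(x)):
            for j in range(len(x)):
                self.a_inv[i][j] -= ax[i] * ax[j] / denom
            self.b[i] += reward * x[i]


def choose(arms: List[LinUCBArm], x: Vector, query_threshold: float) -> Tuple[int, bool]:
    """Pick the arm with the highest upper confidence bound.

    Hypothetical gate: suggest querying the user when that arm is still uncertain.
    """
    stats = [arm.estimate(x) for arm in arms]
    best = max(range(len(arms)), key=lambda i: stats[i][0] + stats[i][1])
    return best, stats[best][1] > query_threshold


arms = [LinUCBArm(dim=2), LinUCBArm(dim=2)]
context = [1.0, 0.0]
first = choose(arms, context, query_threshold=0.8)   # untrained: wide interval
arms[first[0]].update(context, reward=1.0)
second = choose(arms, context, query_threshold=0.8)  # interval has shrunk
print(first, second)  # (0, True) (0, False)
```

In this toy run the system asks for help on the first, fully uncertain decision and acts autonomously once one observation has narrowed the confidence interval for that context.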
Food Portion Estimation via 3D Object Scaling
Vinod, Gautham, He, Jiangpeng, Shao, Zeman, Zhu, Fengqing
Image-based methods to analyze food images have alleviated the user burden and biases associated with traditional methods. However, accurate portion estimation remains a major challenge due to the loss of 3D information in the 2D representation of foods captured by smartphone cameras or wearable devices. In this paper, we propose a new framework to estimate both food volume and energy from 2D images by leveraging the power of 3D food models and physical reference in the eating scene. Our method estimates the pose of the camera and the food object in the input image and recreates the eating occasion by rendering an image of a 3D model of the food with the estimated poses. We also introduce a new dataset, SimpleFood45, which contains 2D images of 45 food items and associated annotations including food volume, weight, and energy. Our method achieves an average error of 31.10 kCal (17.67%) on this dataset, outperforming existing portion estimation methods.
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Health & Medicine > Consumer Health (0.68)
- Education > Health & Safety > School Nutrition (0.68)
Adaptive Visual Imitation Learning for Robotic Assisted Feeding Across Varied Bowl Configurations and Food Types
Liu, Rui, Bhaskar, Amisha, Tokekar, Pratap
In this study, we introduce a novel visual imitation network with a spatial attention module for robotic assisted feeding (RAF). The goal is to acquire (i.e., scoop) food items from a bowl. However, achieving robust and adaptive food manipulation is particularly challenging. To deal with this, we propose a framework that integrates visual perception with imitation learning to enable the robot to handle diverse scenarios during scooping. Our approach, named AVIL (adaptive visual imitation learning), exhibits adaptability and robustness across different bowl configurations in terms of material, size, and position, as well as diverse food types including granular, semi-solid, and liquid, even in the presence of distractors. We validate the effectiveness of our approach by conducting experiments on a real robot. We also compare its performance with a baseline. The results demonstrate improvement over the baseline across all scenarios, with an enhancement of up to 2.5 times in terms of a success metric. Notably, our model, trained solely on data from a transparent glass bowl containing granular cereals, showcases generalization ability when tested zero-shot on other bowl configurations with different types of food.
LAVA: Long-horizon Visual Action based Food Acquisition
Bhaskar, Amisha, Liu, Rui, Sharma, Vishnu D., Shi, Guangyao, Tokekar, Pratap
Robotic Assisted Feeding (RAF) addresses the fundamental need for individuals with mobility impairments to regain autonomy in feeding themselves. The goal of RAF is to use a robot arm to acquire and transfer food to individuals from the table. Existing RAF methods primarily focus on solid foods, leaving a gap in manipulation strategies for semi-solid and deformable foods. This study introduces Long-horizon Visual Action (LAVA) based food acquisition of liquid, semi-solid, and deformable foods. Long-horizon refers to the goal of "clearing the bowl" by sequentially acquiring the food from the bowl. LAVA employs a hierarchical policy for long-horizon food acquisition tasks. The framework uses a high-level policy to determine primitives by leveraging ScoopNet. At the mid-level, LAVA finds parameters for primitives using vision. To carry out sequential plans in the real world, LAVA delegates action execution to a low-level policy that uses parameters received from the mid-level policy together with behavior cloning to ensure precise trajectory execution. We validate our approach on complex real-world acquisition trials involving granular, liquid, semi-solid, and deformable food types along with fruit chunks and soup acquisition. Across 46 bowls, LAVA acquires food much more efficiently than baselines with a success rate of 89 +/- 4% and generalizes across realistic plate variations such as different positions, varieties, and amounts of food in the bowl. Code, datasets, videos, and supplementary materials can be found on our website.
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- Asia > China (0.04)
Yelp Reviews and Food Types: A Comparative Analysis of Ratings, Sentiments, and Topics
Liao, Wenyu, Shi, Yiqing, Hu, Yujia, Quan, Wei
This study examines the relationship between Yelp reviews and food types, investigating how ratings, sentiments, and topics vary across different types of food. Specifically, we analyze how ratings and sentiments of reviews vary across food types, cluster food types based on ratings and sentiments, infer review topics using machine learning models, and compare topic distributions among different food types. Our analyses reveal that some food types have similar rating, sentiment, and topic distributions, while others have distinct patterns. We identify four clusters of food types based on ratings and sentiments and find that reviewers tend to focus on different topics when reviewing certain food types. These findings have important implications for understanding user behavior and cultural influence on digital media platforms and promoting cross-cultural understanding and appreciation.
- Asia > China > Guangdong Province > Zhuhai (0.05)
- North America > United States > Massachusetts (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Africa > Mauritius (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.88)
- Health & Medicine > Consumer Health (1.00)
- Consumer Products & Services > Restaurants (1.00)
Automated Interactive Domain-Specific Conversational Agents that Understand Human Dialogs
Zeng, Yankai, Rajasekharan, Abhiramon, Padalkar, Parth, Basu, Kinjal, Arias, Joaquín, Gupta, Gopal
Achieving human-like communication with machines remains a classic, challenging topic in the fields of Knowledge Representation and Reasoning and Natural Language Processing. Large Language Models (LLMs) rely on pattern-matching rather than a true understanding of the semantic meaning of a sentence. As a result, they may generate incorrect responses. To generate an assuredly correct response, one has to "understand" the semantics of a sentence. To achieve this "understanding", logic-based (commonsense) reasoning methods such as Answer Set Programming (ASP) are arguably needed. In this paper, we describe the AutoConcierge system that leverages LLMs and ASP to develop a conversational agent that can truly "understand" human dialogs in restricted domains. AutoConcierge is focused on a specific domain: advising users about restaurants in their local area based on their preferences. AutoConcierge will interactively understand a user's utterances, identify the missing information in them, and request the user via a natural language sentence to provide it. Once AutoConcierge has determined that all the information has been received, it computes a restaurant recommendation based on the user preferences it has acquired from the human user. AutoConcierge is based on our STAR framework developed earlier, which uses GPT-3 to convert human dialogs into predicates that capture the deep structure of the dialog's sentence. These predicates are then input into the goal-directed s(CASP) ASP system for performing commonsense reasoning. To the best of our knowledge, AutoConcierge is the first automated conversational agent that can realistically converse like a human and provide help to humans based on truly understanding human utterances.
- North America > United States > Texas > Dallas County > Richardson (0.14)
- North America > United States > Texas > Collin County > Plano (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Spain > Galicia > Madrid (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)